Multimodal Hierarchical Dirichlet Process-based Active Perception

نویسندگان

  • Tadahiro Taniguchi
  • Toshiaki Takano
  • Ryo Yoshino
چکیده

In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when the time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an MHDP-based active perception method that uses the information gain (IG) maximization criterion and lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that the criterion is equivalent to a minimization of the expected Kullback–Leibler divergence between a final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive an efficient Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that the IG has submodular and non-decreasing properties as a set function because of the structure of the graphical model of the MHDP. Therefore, the IG maximization problem is reduced to a submodular maximization problem. This means that greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted an experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allow it to recognize target objects quickly and accurately. The results support our theoretical outcomes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Bayesian Hierarchical Framework for Multimodal Active Perception

We will present a Bayesian hierarchical framework for multimodal active perception, devised to be emergent, scalable and adaptive, together with some representative experimental results. This framework, while not strictly neuromimetic, finds its roots in the role of the dorsal perceptual pathway of the human brain. Its composing models build upon a common spatial configuration that is naturally...

متن کامل

Hierarchical Spatial Concept Formation Based on Multimodal Information for Human Support Robots

In this paper, we propose a hierarchical spatial concept formation method based on the Bayesian generative model with multimodal information e.g., vision, position and word information. Since humans have the ability to select an appropriate level of abstraction according to the situation and describe their position linguistically, e.g., "I am in my home" and "I am in front of the table," a hier...

متن کامل

Learning emergent behaviours for a hierarchical Bayesian 3 framework for active robotic perception 4 João

8 Abstract In this research work, we contribute with a 9 behaviour learning process for a hierarchical Bayesian 10 framework for multimodal active perception, devised to be 11 emergent, scalable and adaptive. This framework is com12 posed by models built upon a common spatial configura13 tion for encoding perception and action that is naturally 14 fitting for the integration of readings from mu...

متن کامل

A hierarchical Bayesian framework for multimodal active perception

In this article, we present a hierarchical Bayesian framework for multimodal active perception, devised to be emergent, scalable and adaptive. This framework, while not strictly neuromimetic, finds its roots in the role of the dorsal perceptual pathway of the human brain. Its composing models build upon a common spatial configuration that is naturally fitting for the integration of readings fro...

متن کامل

Bayesian Cognitive Models for 3 D Structure and Motion Multimodal Perception

Humans use various sensory cues to extract crucial information from the environment. With a view of having robots as human companions, we are motivated towards helping to develop a knowledge representation system along the lines of what we know about us. While recent research has shown interesting results, we are still far from having concepts and algorithms that interpret space, coping with th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015